Discriminative training of finite state decoding graphs
Authors
Abstract
Automatic Speech Recognition systems integrate three main knowledge sources: acoustic models, a pronunciation dictionary, and language models. In contrast to common practice, where each source is optimized independently and then combined into a finite-state search space, we investigate here a training procedure that adjusts (some of) the parameters after, rather than before, combination. To this end, we adapted a discriminative training procedure originally devised for language models to the more general case of arbitrary finite-state graphs. Preliminary experiments on a simple name recognition task demonstrate the potential of this approach and suggest possible improvements.
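As a rough illustration of the idea (a minimal sketch, not the procedure actually used in the paper), the update below adjusts the transition weights of a combined decoding graph in a perceptron-style, error-driven fashion: arcs on the erroneous 1-best path are penalized and arcs on the reference-constrained path are rewarded. The arc identifiers, the path representation, and the learning rate eta are assumptions introduced here for illustration only.

    # Hedged sketch in Python: perceptron-style update of decoding-graph arc weights.
    # Paths are represented as lists of arc identifiers; this representation and the
    # update rule are illustrative assumptions, not the authors' implementation.
    from collections import Counter

    def discriminative_update(weights, best_path_arcs, reference_path_arcs, eta=0.1):
        """Raise the weights of arcs on the reference path and lower those on the
        erroneous 1-best path; arcs shared by both paths cancel out."""
        delta = Counter(reference_path_arcs)
        delta.subtract(Counter(best_path_arcs))
        for arc, count in delta.items():
            if count:
                weights[arc] = weights.get(arc, 0.0) + eta * count
        return weights

    # Toy usage: arc 'g2' (wrong path only) is penalized, 'g3' (reference only) rewarded.
    weights = {"g1": 0.0, "g2": 0.0, "g3": 0.0}
    weights = discriminative_update(weights, ["g1", "g2"], ["g1", "g3"])
    print(weights)  # {'g1': 0.0, 'g2': -0.1, 'g3': 0.1}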
Similar resources
Joint optimization on decoding graphs using minimum classification error criterion
Motivated by the inherent correlation between speech features and their lexical words, we propose in this paper a new framework for jointly learning the parameters of the corresponding acoustic and language models. The proposed framework is based on discriminative training of the models' parameters using the minimum classification error criterion. To verify the effectiveness of the proposed fra...
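For reference, the minimum classification error (MCE) criterion commonly takes the following form (a standard statement of the criterion, not necessarily the exact formulation used in that paper). Here g_j(X; \Lambda) is the discriminant score of hypothesis j for utterance X, e.g. the combined acoustic and language model log-score, y is the correct transcription, and \Lambda collects the jointly trained parameters:

    d(X;\Lambda) = -g_{y}(X;\Lambda) + \frac{1}{\eta}\log\left[\frac{1}{N-1}\sum_{j \neq y} e^{\eta\, g_{j}(X;\Lambda)}\right]

    \ell\big(d(X;\Lambda)\big) = \frac{1}{1 + e^{-\gamma\, d(X;\Lambda) + \theta}}

The smoothing constants \eta, \gamma, and \theta make the loss \ell a differentiable approximation of the 0-1 error, so the acoustic and language model parameters can be optimized jointly by gradient descent.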
Optimization on decoding graphs by discriminative training
The three main knowledge sources used in automatic speech recognition (ASR), namely the acoustic models, a dictionary, and a language model, are usually designed and optimized in isolation. Our previous work [1] proposed a methodology for jointly tuning these parameters, based on the integration of the resources into a finite-state graph whose transition weights are trained discriminatively. ...
RASR – The RWTH Aachen University Open Source Speech Recognition Toolkit
RASR is the open source version of the well-proven speech recognition toolkit developed and used at RWTH Aachen University. The current version of the package includes state-of-the-art speech recognition technology for acoustic model training and decoding. Speaker adaptation, speaker adaptive training, unsupervised training, discriminative training, lattice processing tools, flexible signal ana...
A unified framework for translation and understanding allowing discriminative joint decoding for multilingual speech semantic interpretation
Probabilistic approaches are now widespread in most natural language processing applications and selection of a particular approach usually depends on the task at hand. Targeting speech semantic interpretation in a multilingual context, this paper presents a comparison between the state-of-the-art methods used for machine translation and speech understanding. This comparison justifies our propo...
Discriminative training of WFST factors with application to pronunciation modeling
One of the most popular speech recognition architectures consists of multiple components (such as the acoustic, pronunciation, and language models) that are modeled as weighted finite-state transducer (WFST) factors in a cascade. These factor WFSTs are typically trained in isolation and combined efficiently for decoding. Recent work has explored jointly estimating parameters for these models using ...
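For context, the factor transducers in such a cascade are conventionally combined (one common construction, not specific to that paper) as

    D = \min\big(\det(H \circ C \circ L \circ G)\big)

where H maps HMM states to context-dependent phones, C maps context-dependent phones to context-independent phones, L is the pronunciation lexicon, and G is the language model. Discriminative training can then target either the arc weights of an individual factor such as L or, as in the main paper above, the weights of the composed graph itself.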
Journal:
Volume / Issue:
Pages: -
Publication date: 2005